
Lab 2 - Library Perspective

Task: Common Functions

Enter the chapters/software-stack/libc/drills/tasks/common-functions/ folder, run make skels, then enter support/. Go through the practice items below.

  1. Update os_string.c and os_string.h to make available the os_strcat() function that performs the same string concatenation as strcat() from libc. Check your implementation by running make check in support/tests/. If some of the tests fail, start debugging from the file that calls os_strcat(): test.c.

  2. Update the main_printf.c file to use the implementation of sprintf() to collect information to be printed inside a buffer. Call the write() function to print the information. The printf() function will no longer be called. This results in a single write() system call.

    Using previously implemented functions allows us to more efficiently write new programs. These functions provide us with extensive features that we use in our programs.

  3. Update the putchar() function in main_printf.c to give printf() buffered output. Characters passed via the putchar() call will be stored in a predefined static global buffer. The write() call will be invoked when a newline is encountered or when the buffer is full. This reduces the number of write() system calls. Use strace to confirm the reduction.

  4. Update the main_printf.c file to also feature a flush() function that forces the static global buffer to be written out through a write() system call. Make calls to printf() and flush() to validate the implementation. Use strace to inspect the write() system calls invoked by printf() and flush().

If you're having difficulties solving this exercise, go through this reading material.

Task: Libraries and libc

Enter the chapters/software-stack/libc/drills/tasks/libc/ folder, run make skels, then enter support/. Now go through the practice items below.

  1. Use malloc() and free() functions in the memory.c program. Make your own use of the allocated memory.

    It's very easy to use memory management functions with the libc. The alternative (without the libc) would be more cumbersome.

    Use different values for malloc(), i.e. the allocation size. Use strace to check the system calls invoked by malloc() and free(). You'll see that, depending on the size, the brk() or mmap() / munmap() system calls are invoked. And for certain calls to malloc() / free() no syscall is happening. You'll find more about them in the Data chapter.

  2. Create your own C program with calls to the standard C library in vendetta.c. Be as creative as you can about the types of functions being called.

  3. Inside the vendetta.c file make a call open("a.txt", O_RDWR | O_CREAT, 0644) to open / create the a.txt file. Make sure you include all required headers. Check the system call being made.

    Make an fopen() call with the proper arguments that gets as close as possible to the open() call, i.e. the system call arguments are as close as possible.

  4. Inside the vendetta.c file make a call to the sin() function (for sine). Compute sin(0) and sin(PI/2).

If you're having difficulties solving this exercise, go through this reading material.

Task: High-Level Languages

Enter the chapters/software-stack/high-level-languages/drills/tasks/high-level-lang/ folder, run make skels, then enter support/. Then go through the practice items below.

  1. Use make to create the hello executable from the hello.go file (a Go "Hello, World!"-printing program). Use ltrace and strace to compute the number of library calls and system calls. Use perf to measure the running time.

    Compare the values with those from the "Hello, World!"-printing programs in C and Python.

  2. Create a "Hello, World!"-printing program in a programming language of your choice (other than C, Python and Go). Find the values above (library calls, system calls and running time).

  3. Create programs in C, Python and Go that compute the N-th Fibonacci number. N is passed as a command-line argument. Run the checker (make check in the high-level-lang/solution/tests/ folder) to check your results.

    Use ltrace and strace to compute the number of library calls and system calls. Use perf to measure the running time.

    Compare the values of the three programs.

  4. Create programs in C, Python and Go that copy a source file into a destination file. Both files are passed as the two command-line arguments for the program. Run the checker (make check in the high-level-lang/support/tests/ folder) to check your results.

    Sample run:

    student@so:~/.../solution/tests/$ make check
    make -C ../src
    make[1]: Entering directory '/media/teo/1TB/Poli/Asistent/SO/operating-systems/chapters/software-stack/high-level-languages/drills/tasks/high-level-lang/solution/src'
    go build -ldflags '-linkmode external -extldflags "-dynamic"' hello.go
    cc -z lazy fibo.c -o fibo
    go build -o fibo_go -ldflags '-linkmode external -extldflags "-dynamic"' fibo.go
    cc -z lazy copy.c -o copy
    go build -o copy_go -ldflags '-linkmode external -extldflags "-dynamic"' copy.go
    make[1]: Leaving directory '/media/teo/1TB/Poli/Asistent/SO/operating-systems/chapters/software-stack/high-level-languages/drills/tasks/high-level-lang/solution/src'
    Fibonacci [C] -- fibo(10) == 55 -- PASSED
    Fibonacci [C] -- fibo( 5) == 5 -- PASSED
    Fibonacci [C] -- fibo(20) == 6765 -- PASSED
    Fibonacci [Python] -- fibo(10) == 55 -- PASSED
    Fibonacci [Python] -- fibo( 5) == 5 -- PASSED
    Fibonacci [Python] -- fibo(20) == 6765 -- PASSED
    Fibonacci [Go] -- fibo(10) == 55 -- PASSED
    Fibonacci [Go] -- fibo( 5) == 5 -- PASSED
    Fibonacci [Go] -- fibo(20) == 6765 -- PASSED
    Copy [C] -- PASSED
    Copy [Python] -- PASSED
    Copy [Go] -- PASSED

    Use ltrace and strace to compute the number of library calls and system calls. Use perf to measure the running time. Use source files of different sizes. Compare the outputs of these commands on the three programs.

If you're having difficulties solving this exercise, go through this reading material.

Task: App Investigation

Enter the chapters/software-stack/applications/drills/tasks/app-investigation/support/ folder and go through the practice items below. Select a binary executable application and a scripted application.

  1. Use ldd on the two applications. Notice the resulting messages and explain the results.

  2. Use ltrace and strace on the two applications. Follow the library calls and the system calls done by each application.

  3. Check to see whether there are statically-linked application executables in the system. The file command tells if the file passed as argument is a statically-linked executable. If you can't find one, install the busybox-static package.

  4. Look into what busybox is and explain why it's customary to have it as a statically-linked executable.

  5. Run ldd, nm, strace, ltrace on a statically-linked application executable. Explain the results.

If you're having difficulties solving this exercise, go through this reading material.

Guide: Statically-linked and Dynamically-linked Libraries

Libraries can be statically-linked or dynamically-linked, creating statically-linked executables and dynamically-linked executables. Typically, the executables found in modern operating systems are dynamically-linked, given their reduced size and ability to share libraries at runtime.

The chapters/software-stack/libraries/guides/static-dynamic/support/ folder stores the implementation of a simple "Hello, World!"-printing program that uses both static and dynamic linking of libraries. Let's build and run the two executables:

student@os:~/.../static-dynamic/support$ ls
hello.c Makefile

student@os:~/.../static-dynamic/support$ make
cc -Wall -c -o hello.o hello.c
cc hello.o -o hello
cc -static -o hello_static hello.o

student@os:~/.../static-dynamic/support$ ls -lh
total 852K
-rwxrwxr-x 1 razvan razvan 8.2K Aug 2 15:53 hello
-rw-rw-r-- 1 razvan razvan 76 Aug 2 15:51 hello.c
-rw-rw-r-- 1 razvan razvan 1.6K Aug 2 15:53 hello.o
-rwxrwxr-x 1 razvan razvan 827K Aug 2 15:53 hello_static
-rw-rw-r-- 1 razvan razvan 237 Aug 2 15:53 Makefile

student@os:~/.../static-dynamic/support$ ./hello
Hello, World!

student@os:~/.../static-dynamic/support$ ./hello_static
Hello, World!

The two executables (hello and hello_static) behave the same, despite having vastly different sizes (8.2K vs. 827K, roughly 100 times larger).

We use nm and ldd to catch differences between the two types of resulting executables:

student@os:~/.../static-dynamic/support$ ldd hello
linux-vdso.so.1 (0x00007ffc8d9b2000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f10d1d88000)
/lib64/ld-linux-x86-64.so.2 (0x00007f10d237b000)

student@os:~/.../static-dynamic/support$ ldd hello_static
not a dynamic executable

student@os:~/.../static-dynamic/support$ nm hello | wc -l
33

student@os:~/.../static-dynamic/support$ nm hello_static | wc -l
1674

The dynamic executable references the dynamically-linked libc library (/lib/x86_64-linux-gnu/libc.so.6), while the statically-linked executable has no references. Also, given that the statically-linked executable integrates entire parts of statically-linked libraries, it has many more symbols than the dynamically-linked executable (1674 vs. 33).

We can use strace to see that there are differences in the preparatory system calls for each type of executables. For the dynamically-linked executable, the dynamically-linked library (/lib/x86_64-linux-gnu/libc.so.6) is opened during runtime:

student@os:~/.../static-dynamic/support$ strace ./hello
execve("./hello", ["./hello"], 0x7ffc409c6640 /* 66 vars */) = 0
brk(NULL) = 0x55a72eda6000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=198014, ...}) = 0
mmap(NULL, 198014, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f3136a41000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\240\35\2\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=2030928, ...}) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f3136a3f000
mmap(NULL, 4131552, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f3136458000
mprotect(0x7f313663f000, 2097152, PROT_NONE) = 0
mmap(0x7f313683f000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1e7000) = 0x7f313683f000
mmap(0x7f3136845000, 15072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f3136845000
close(3) = 0
arch_prctl(ARCH_SET_FS, 0x7f3136a404c0) = 0
mprotect(0x7f313683f000, 16384, PROT_READ) = 0
mprotect(0x55a72d1bb000, 4096, PROT_READ) = 0
mprotect(0x7f3136a72000, 4096, PROT_READ) = 0
munmap(0x7f3136a41000, 198014) = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 18), ...}) = 0
brk(NULL) = 0x55a72eda6000
brk(0x55a72edc7000) = 0x55a72edc7000
write(1, "Hello, World!\n", 14Hello, World!
) = 14
exit_group(0) = ?
+++ exited with 0 +++

student@os:~/.../static-dynamic/support$ strace ./hello_static
execve("./hello_static", ["./hello_static"], 0x7ffc9fd45400 /* 66 vars */) = 0
brk(NULL) = 0xff8000
brk(0xff91c0) = 0xff91c0
arch_prctl(ARCH_SET_FS, 0xff8880) = 0
uname({sysname="Linux", nodename="yggdrasil", ...}) = 0
readlink("/proc/self/exe", "/home/razvan/school/so/operating"..., 4096) = 116
brk(0x101a1c0) = 0x101a1c0
brk(0x101b000) = 0x101b000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 18), ...}) = 0
write(1, "Hello, World!\n", 14Hello, World!
) = 14
exit_group(0) = ?
+++ exited with 0 +++

Similarly, we can investigate a system executable (/bin/ls) to see that indeed all referenced dynamically-linked libraries are opened (via the openat system call) at runtime:

student@os:~/.../static-dynamic/support$ ldd $(which ls)
linux-vdso.so.1 (0x00007ffc3bdf3000)
libselinux.so.1 => /lib/x86_64-linux-gnu/libselinux.so.1 (0x00007f092bd88000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f092b997000)
libpcre.so.3 => /lib/x86_64-linux-gnu/libpcre.so.3 (0x00007f092b726000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f092b522000)
/lib64/ld-linux-x86-64.so.2 (0x00007f092c1d2000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f092b303000)

student@os:~/.../static-dynamic/support$ strace -e openat ls
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libselinux.so.1", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libpcre.so.3", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libpthread.so.0", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/proc/filesystems", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, ".", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
community docs _index.html search.md
+++ exited with 0 +++

Common Functions

By using wrapper calls, we are able to write our programs in C. However, we still need to implement common functions for string management, working with I/O, working with memory.

The simple approach is to implement these functions (printf() or strcpy() or malloc()) once in a C source code file and then reuse them when needed. This saves us time (we don't have to reimplement) and allows us to constantly improve one implementation; there will only be one implementation that we update to increase its safety, efficiency or performance.

Go to chapters/software-stack/libc/drills/tasks/common-functions/ and run make skels. The support/ folder stores the implementation of string management functions, in os_string.c and os_string.h and of printing functions in printf.c and printf.h. The printf() implementation is this one.

There are two programs: main_string.c showcases string management functions, main_printf.c showcases the printf() function.

main_string.c depends on the os_string.h and os_string.c files that implement the os_strlen() and os_strcpy() functions. We print messages using the write() system call wrapper implemented in syscall.s.

Let's build and run the program:

student@os:~/.../common-functions/support/src$ make main_string
gcc -fno-PIC -fno-stack-protector -c -o main_string.o main_string.c
gcc -fno-PIC -fno-stack-protector -c -o os_string.o os_string.c
nasm -f elf64 -o syscall.o syscall.s
gcc -nostdlib -no-pie -Wl,--entry=main -Wl,--build-id=none main_string.o os_string.o syscall.o -o main_string

student@os:~/.../common-functions/support/src$ ./main_string
Destination string is: warhammer40k

student@os:~/.../common-functions/support/src$ strace ./main_string
execve("./main_string", ["./main_string"], 0x7ffd544d0a70 /* 63 vars */) = 0
write(1, "Destination string is: ", 23Destination string is: ) = 23
write(1, "warhammer40k\n", 13warhammer40k
) = 13
exit(0) = ?
+++ exited with 0 +++

When using strace we see that only the write() system call wrapper triggers a system call. There are no system calls triggered by os_strlen() and os_strcpy() as can be seen in their implementation.

In addition, main_printf.c depends on the printf.h and printf.c files that implement the printf() function. There is a requirement to implement the _putchar() function; we implement it in the main_printf.c file using the write() system call wrapper. The main() function in the main_printf.c file contains all the string and printing calls. printf() offers a more powerful printing interface, allowing us to print addresses and integers.

Let's build and run the program:

student@os:~/.../common-functions/support$ make main_printf
gcc -fno-PIC -fno-stack-protector -c -o main_printf.o main_printf.c
gcc -fno-PIC -fno-stack-protector -c -o printf.o printf.c
gcc -no-pie main_printf.o printf.o syscall.o -o main_printf

student@os:~/.../common-functions/support$ ./main_printf
[before] src is at 00000000004026A0, len is 12, content: "warhammer40k"
[before] dest is at 0000000000603000, len is 0, content: ""
copying src to dest
[after] src is at 00000000004026A0, len is 12, content: "warhammer40k"
[after] dest is at 0000000000603000, len is 12, content: "warhammer40k"

student@os:~/.../common-functions/support$ strace ./main_printf
[...]
write(1, "[", 1[) = 1
write(1, "b", 1b) = 1
write(1, "e", 1e) = 1
write(1, "f", 1f) = 1
write(1, "o", 1o) = 1
write(1, "r", 1r) = 1
write(1, "e", 1e) = 1
write(1, "]", 1]) = 1
[...]

We see that we have greater printing flexibility with the printf() function. However, one downside of the current implementation is that it makes a system call for each character. This is inefficient and could be improved by printing a whole string.

Libraries and libc

Once we have common functions implemented, we can reuse them at any time. The main unit for software reusability is the library. In short, a library is common machine code that can be linked against other software components. Each time we want to use the printf() function or the strlen() function, we don't need to reimplement them. We also don't need to use existing source code files, rebuild them and reuse them. We (re)use existing machine code in libraries.

A library is a collection of object files that export given data structures and functions to be used by other programs. We create a program, we compile and then we link it against the library for all the features it provides.

The most important library in modern operating systems is the standard C library, also called libc. This is the library providing system call wrappers and basic functionality for input-output, string management, memory management. By default, a program is always linked with the standard C library. In the examples above, we've explicitly disabled the use of the standard C library with the help of the -nostdlib linker option.

By using the standard C library, it's much easier to create new programs. You call existing functionality in the library and implement only features particular to your program.

The chapters/software-stack/libc/drills/tasks/libc/support/ folder stores the implementation of programs using the standard C library: hello.c, main_string.c and main_printf.c. These programs are almost identical to those used in the past sections:

  • hello.c is similar to the programs in chapters/software-stack/system-calls/drills/tasks/basic-syscall/solution/ and chapters/software-stack/system-calls/drills/tasks/syscall-wrapper/solution/
  • main_string.c and main_printf.c are similar to the programs in chapters/software-stack/libc/drills/tasks/common-functions/solution/

Let's build and run them:

student@os:~/.../libc/support$ ls
hello hello.c hello.o main_printf main_printf.c main_printf.o main_string main_string.c main_string.o Makefile

student@os:~/.../libc/support$ make clean
rm -f hello hello.o
rm -f main_printf main_printf.o
rm -f main_string main_string.o

student@os:~/.../libc/support$ ls
hello.c main_printf.c main_string.c Makefile

student@os:~/.../libc/support$ make
cc -Wall -c -o hello.o hello.c
cc -static hello.o -o hello
cc -Wall -c -o main_printf.o main_printf.c
cc -static main_printf.o -o main_printf
cc -Wall -c -o main_string.o main_string.c
cc -static main_string.o -o main_string

student@os:~/.../libc/support$ ls
hello hello.c hello.o main_printf main_printf.c main_printf.o main_string main_string.c main_string.o Makefile

student@os:~/.../libc/support$ ./hello
Hello, world!
Bye, world!
aaa
aaa
^C

student@os:~/.../libc/support$ ./main_string
Destination string is: warhammer40k

student@os:~/.../libc/support$ ./main_printf
[before] src is at 0x492308, len is 12, content: "warhammer40k"
[before] dest is at 0x6bb340, len is 0, content: ""
copying src to dest
[after] src is at 0x492308, len is 12, content: "warhammer40k"
[after] dest is at 0x6bb340, len is 12, content: "warhammer40k"
abc

The behavior / output is similar to the ones in the previous sections:

student@os:~/.../libc/support$ ../../solution/basic-syscall/hello-nasm
Hello, world!
Bye, world!
aaa
aaa
^C

student@os:~/.../libc/support$ ../../solution/common-functions/main_string
Destination string is: warhammer40k

student@os:~/.../libc/support$ ../../solution/common-functions/main_printf
[before] src is at 0000000000402680, len is 12, content: "warhammer40k"
[before] dest is at 0000000000604000, len is 0, content: ""
copying src to dest
[after] src is at 0000000000402680, len is 12, content: "warhammer40k"
[after] dest is at 0000000000604000, len is 12, content: "warhammer40k"
abc

We can inspect the system calls made to check the similarities. For example, for the main_printf program we get the outputs:

student@os:~/.../libc/support$ strace ./main_printf
execve("./main_printf", ["./main_printf"], 0x7fff7b38c240 /* 66 vars */) = 0
brk(NULL) = 0x15af000
brk(0x15b01c0) = 0x15b01c0
arch_prctl(ARCH_SET_FS, 0x15af880) = 0
uname({sysname="Linux", nodename="[...]", ...}) = 0
readlink("/proc/self/exe", "[...]/operating"..., 4096) = 105
brk(0x15d11c0) = 0x15d11c0
brk(0x15d2000) = 0x15d2000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 18), ...}) = 0
write(1, "[before] src is at 0x492308, len"..., 64[before] src is at 0x492308, len is 12, content: "warhammer40k"
) = 64
write(1, "[before] dest is at 0x6bb340, le"..., 52[before] dest is at 0x6bb340, len is 0, content: ""
) = 52
write(1, "copying src to dest\n", 20copying src to dest
) = 20
write(1, "[after] src is at 0x492308, len "..., 63[after] src is at 0x492308, len is 12, content: "warhammer40k"
) = 63
write(1, "[after] dest is at 0x6bb340, len"..., 64[after] dest is at 0x6bb340, len is 12, content: "warhammer40k"
) = 64
write(1, "ab", 2ab) = 2
write(1, "c\n", 2c
) = 2
exit_group(0) = ?
+++ exited with 0 +++

student@os:~/.../libc/support$ strace ../../solution/common-functions/main_printf
execve("../../solution/common-functions/main_printf", ["../../solution/common-functions/"...], 0x7ffe204eec00 /* 66 vars */) = 0
write(1, "[before] src is at 0000000000402"..., 72[before] src is at 0000000000402680, len is 12, content: "warhammer40k"
) = 72
write(1, "[before] dest is at 000000000060"..., 60[before] dest is at 0000000000604000, len is 0, content: ""
) = 60
write(1, "copying src to dest\n", 20copying src to dest
) = 20
write(1, "[after] src is at 00000000004026"..., 71[after] src is at 0000000000402680, len is 12, content: "warhammer40k"
) = 71
write(1, "[after] dest is at 0000000000604"..., 72[after] dest is at 0000000000604000, len is 12, content: "warhammer40k"
) = 72
write(1, "ab", 2ab) = 2
write(1, "c\n", 2c
) = 2
exit(0) = ?
+++ exited with 0 +++

The output is similar, with differences at the beginning and the end of the system call trace. In the case of the libc-built program, a series of additional system calls (brk, arch_prctl, uname etc.) are made. Also, there is an implicit call to exit_group instead of an explicit one to exit in the non-libc case. These are initialization and cleanup routines that are implicitly added when using the standard C library. They are generally used for setting up and cleaning up the stack, environment variables and other pieces of information required by the program or the standard C library itself.

We could argue that the initialization steps incur overhead, and that's a downside of using the standard C library. However, these initialization steps are required for almost all programs. And, given that almost all programs make use of the basic features of the standard C library, libc is almost always used. We can say the above were exceptions to the rule, where we didn't make use of the standard C library.

Summarizing, the advantages and disadvantages of using the standard C library are:

  • (+) easier development: call existing functions already implemented in the standard C library; default build and link flags
  • (+) portability: if the system provides a standard C library, one calls the library functions that will then interact with the lower-layer API
  • (+) implicit initialization and cleanup: no need for you to explicitly create them
  • (-) usually larger in size (static) executables
  • (-) a level of overhead as the standard C library wraps system calls
  • (-) potential security issues: a larger set of (potentially vulnerable) functions are presented by the standard C library

High-Level Languages

Using the standard C library (libc) frees the programmer from the cumbersome steps of invoking system calls and reimplementing common features. Still, for improved development time and safety, other programming languages can be used, such as Rust, Python, JavaScript. Most (if not all) of these high-level programming languages still make use of the standard C library, such that a call to a function in Python would end up making a call to a function in the standard C library.

The chapters/software-stack/high-level-languages/drills/tasks/high-level-lang/support/ folder stores the implementation of a simple "Hello, World!"-printing program in Python. We simply invoke the python interpreter to run the program:

student@os:~/.../high-level-lang/support$ python hello.py
Hello, world!

We count the number of functions called from the standard C library and the number of system calls:

student@os:~/.../high-level-lang/support$ ltrace -l 'libc*' python hello.py 2> libc.out
Hello, world!

student@os:~/.../high-level-lang/support$ wc -l libc.out
50469 libc.out

student@os:~/.../high-level-lang/support$ strace python hello.py 2> syscall.out
Hello, world!

student@os:~/.../high-level-lang/support$ wc -l syscall.out
948 syscall.out

The dynamic standard C library (libc.so.6) is a dependency of the Python interpreter (/usr/bin/python3):

student@os:~/.../high-level-lang/support$ ldd /usr/bin/python3
[...]
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fa6fd6d0000)
[...]

We can see the complexity of invoking the Python interpreter, resulting in more than 50,000 library calls being made. This means added overhead versus a simple C function. However, this also means faster development in the Python programming language. Each new layer in the software stack simplifies development but adds overhead.

We can use perf to compare the running time of the Python and C "Hello, World!"-printing programs:

student@os:~/.../high-level-lang/support$ sudo perf stat ../static-dynamic/hello
Hello, World!

Performance counter stats for '../static-dynamic/hello':

0.46 msec task-clock # 0.559 CPUs utilized
0 context-switches # 0.000 K/sec
0 cpu-migrations # 0.000 K/sec
52 page-faults # 0.114 M/sec
859,341 cycles # 1.882 GHz
713,395 instructions # 0.83 insn per cycle
141,710 branches # 310.393 M/sec
6,208 branch-misses # 4.38% of all branches

0.000816974 seconds time elapsed

0.000872000 seconds user
0.000000000 seconds sys

student@os:~/.../high-level-lang/support$ sudo perf stat python hello.py
Hello, world!

Performance counter stats for 'python hello.py':

69.39 msec task-clock # 0.992 CPUs utilized
2 context-switches # 0.029 K/sec
0 cpu-migrations # 0.000 K/sec
1,115 page-faults # 0.016 M/sec
74,405,125 cycles # 1.072 GHz
84,957,056 instructions # 1.14 insn per cycle
18,574,724 branches # 267.689 M/sec
759,104 branch-misses # 4.09% of all branches

0.069981351 seconds time elapsed

0.054376000 seconds user
0.015536000 seconds sys

We can see that on all metrics, the running of the Python program is less efficient than the running of the C program. The Python code takes 69 milliseconds, whereas the C code runs in less than 1 millisecond.

When deciding what programming language and what libraries and software components to use, you have to balance requirements for fast development and increased safety (inherent to higher-level programming languages) with requirements for speed or efficiency (common to lower-level programming languages such as C). Newer programming languages such as Go, Rust, D aim to combine the benefits of high-level programming languages with efficiency close to that of the C programming language. Generally, additional software layers (libraries, language environments, interpreters) simplify development but decrease speed and efficiency.

App Investigation

Let's spend some time investigating actual applications residing on the local system. For now, we know that applications are developed using high-level languages and then compiled or interpreted to use the lower-layer interfaces of the software stack, such as the system call API.

Let's enter the chapters/software-stack/applications/drills/tasks/app-investigation/support/ folder and run the get_app_types.sh script:

student@os:~/.../app-investigation/support/$ ./get_app_types.sh
binary apps: 2223
Perl apps: 256
Shell apps: 454
Python apps: 123
Other apps: 27

The script prints the types of the application executables in the system. The output will differ between systems, given each has particular types of applications installed.

We list them by running the command inside the get_app_types.sh script:

student@os:~/.../app-investigation/support/$ find /usr/bin /bin /usr/sbin /sbin -type f -exec file {} \;
[...]
/usr/bin/qpdldecode: ELF 64-bit LSB shared object, x86-64 [...]
/usr/bin/mimeopen: Perl script text executable
[...]

As above, the output will differ between systems.

So, depending on the developers' choice, applications may be:

  • compiled into executables, from compiled languages such as C, C++, Go, Rust, D
  • developed as scripts, from interpreted languages such as Python, Perl, JavaScript